 ghanaian language


The Ghanaian NLP Landscape: A First Look

arXiv.org Artificial Intelligence

Despite comprising one-third of global languages, African languages are critically underrepresented in Artificial Intelligence (AI), threatening linguistic diversity and cultural heritage. Ghanaian languages, in particular, face an alarming decline: some are documented as extinct and several others are at risk. This study pioneers a comprehensive survey of Natural Language Processing (NLP) research focused on Ghanaian languages, identifying the methodologies, datasets, and techniques employed. Additionally, we create a detailed roadmap outlining challenges, best practices, and future directions, aiming to improve accessibility for researchers. This work serves as a foundational resource for Ghanaian NLP research and underscores the critical need for integrating global linguistic diversity into AI development.


NLP for Ghanaian Languages

arXiv.org Artificial Intelligence

The advancement in machine learning computational power coupled with the recent investment within the domain by technological companies has stimulated considerable interest and brought about a legion of applications in natural language digitisation in developed countries, ...

In the much-applauded interventions by Google and Microsoft through their translation services, quite a number of African languages have been integrated, but Ghanaian languages are excluded (Google, 2020; Microsoft, 2021). A historic move worth mentioning is Baidu Translate's incorporation of the Twi language in their translation service.


Introducing ABENA: BERT Natural Language Processing for Twi

#artificialintelligence

In our previous blog post we introduced a preliminary Twi embedding model based on fastText and visualized it using the TensorFlow Embedding Projector. As a reminder, text embeddings allow you to convert text into numbers, or vectors, on which a computer can perform arithmetic operations to enable it to reason about human language, i.e., carry out natural language processing (NLP). A screenshot of our fastText Twi embeddings from that exercise is shown in Figure 1. This model, which we have shared in our Kasa Library repo, enables a computer to begin to reason in Twi computationally. However, it is "static" in the sense that the vectors do not change with different contexts. State-of-the-art NLP in high-resource languages such as English has largely moved away from these to more sophisticated "dynamic" embeddings capable of understanding changing contexts.
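To make the static versus dynamic distinction concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the three-dimensional vectors are made up, English tokens stand in for Twi, and the "contextual" function is a simple averaging stand-in, not BERT, fastText, or anything from the Kasa Library. It only shows that a static lookup returns the same vector for a word in every sentence, while a context-aware function does not.

```python
import math

# Hypothetical toy vectors. Real fastText vectors have hundreds of dimensions
# and are learned from a corpus; these values are invented for illustration.
TOY_VECTORS = {
    "bank": [0.5, 0.5, 0.0],
    "river": [0.0, 1.0, 0.0],
    "loan": [1.0, 0.0, 0.0],
}


def cosine(u, v):
    """Cosine similarity: the kind of arithmetic embeddings make possible."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def static_embedding(token, context):
    """Static lookup: the surrounding context is ignored entirely."""
    return TOY_VECTORS[token]


def toy_contextual_embedding(token, context):
    """Stand-in for a dynamic embedding (NOT BERT): blend the token's vector
    with the mean vector of the sentence it appears in."""
    vectors = [TOY_VECTORS[t] for t in context]
    mean = [sum(column) / len(vectors) for column in zip(*vectors)]
    return [(b + m) / 2 for b, m in zip(TOY_VECTORS[token], mean)]


river_sentence = ["river", "bank"]
loan_sentence = ["bank", "loan"]

# The static vector for "bank" is identical in both sentences.
static_same = (
    static_embedding("bank", river_sentence)
    == static_embedding("bank", loan_sentence)
)

# The toy contextual vector for "bank" shifts toward its neighbours.
bank_near_river = toy_contextual_embedding("bank", river_sentence)
bank_near_loan = toy_contextual_embedding("bank", loan_sentence)

print("static vectors identical:", static_same)
print("contextual vectors identical:", bank_near_river == bank_near_loan)
```

In this toy setup the vector for "bank" moves closer to "river" in one sentence and closer to "loan" in the other. That shift is the behaviour static embeddings cannot provide.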